Learning 5000 Relational Extractors

Authors

Raphael Hoffmann

Congle Zhang

Daniel S. Weld

Abstract

Many researchers are trying to use information extraction (IE) to create large-scale knowledge bases from natural language text on the Web. However, the primary approach (supervised learning of relation-specific extractors) requires manually-labeled training data for each relation and doesn’t scale to the thousands of relations encoded in Web text. This paper presents LUCHS, a self-supervised, relation-specific IE system which learns 5025 relations — more than an order of magnitude greater than any previous approach — with an average F1 score of 61%. Crucial to LUCHS’s performance is an automated system for dynamic lexicon learning, which allows it to learn accurately from heuristically-generated training data, which is often noisy and sparse.

Similar resources

Distant Supervised Relation Extraction with Wikipedia and Freebase

In this paper we discuss a new approach to extracting relational data from unstructured text without the need for hand-labeled data. So-called distant supervision has the advantage that it scales to large amounts of web data and therefore fulfills the requirements of current information extraction tasks. As opposed to supervised machine learning, we train generic, relation- and domain-independent extractor...

Data Mining on Symbolic Knowledge Extracted from the Web

Information extractors and classifiers operating on unrestricted, unstructured texts are an error-prone source of large amounts of potentially useful information, especially when combined with a crawler which automatically augments the knowledge base from the World Wide Web. At the same time, there is much structured information on the World Wide Web. Wrapping the web sites which provide this kind...

Interactive Learning of Relation Extractors with Weak Supervision

Mining: Foundations, Techniques and Applications. Finite-State Transducers for Semi-Structured Text

Text mining for semi-structured documents requires information extractors. Programming extractors by hand makes it difficult to keep up with the amount and variation of the documents placed on the World Wide Web every day. This paper presents our recent results on applying machine learning techniques to automate the generation of the extractors. Our goal is to develop a domain and language indepen...

Learning audio and image representations with bio-inspired trainable feature extractors

Recent advancements in pattern recognition and signal processing concern the automatic learning of data representations from labeled training samples. Typical approaches are based on deep learning and convolutional neural networks, which require a large number of labeled training samples. In this work, we propose novel feature extractors that can be used to learn the representation of single prot...

Publication date: 2010